


REPORT RESUMES

ED 015 006    PS 000 205

DEVELOPMENT OF APPROPRIATE EVALUATION TECHNIQUES FOR
SCREENING CHILDREN IN A HEAD START PROGRAM. A PILOT PROJECT.
BY- BERGER, STANLEY I.

REPORT NUMBER OEO-SIS 

EDRS PRICE MF-$0.25 HC-$0.60 13P.

DESCRIPTORS- CULTURAL DISADVANTAGEMENT, INTELLIGENCE TESTS,
*TEST VALIDITY, TEST SELECTION, *EVALUATION TECHNIQUES,
*SCREENING TESTS, PROGRAM EVALUATION, PRESCHOOL TESTS,
INTELLECTUAL DEVELOPMENT, VERBAL ABILITY, *EARLY EXPERIENCE,
TEST RELIABILITY, *COGNITIVE MEASUREMENT, MEASUREMENT
INSTRUMENTS, HEAD START, PPVT, LEITER INTERNATIONAL, RAVEN
PROGRESSIVE MATRICES, STANFORD BINET.

THE PURPOSES OF THIS PILOT PROJECT WERE (1) TO ATTEMPT 
TO EVALUATE THE EFFECT OF THE LOCAL PROGRAM ON BOTH 
INDIVIDUAL CHILDREN AND THE GROUP AND (2) TO INVESTIGATE THE 
SENSITIVITY OF THE TEST INSTRUMENTS EMPLOYED IN EVALUATING 
SUCH A PROGRAM. SIXTY-ONE CHILDREN WERE ENROLLED IN THE LOCAL
HEADSTART PROGRAM AND WERE ADMINISTERED THE STANFORD-BINET,
LEITER INTERNATIONAL, RAVEN PROGRESSIVE MATRICES, AND PEABODY
PICTURE VOCABULARY TESTS. IN ADDITION, 20 CHILDREN, SELECTED
AT RANDOM FROM THE GROUP, WERE TESTED BOTH BEFORE AND AFTER
THE PROGRAM. RESULTS INDICATE (1) STATISTICALLY SIGNIFICANT
IMPROVEMENT IN PERFORMANCE FOR THE 20 CHILDREN, (2)
SIGNIFICANT CORRELATIONS AMONG THE VARIOUS TEST SCORES OF THE
TOTAL GROUP, AND (3) PARTICULAR SENSITIVITY OF THE LEITER AND
PEABODY TESTS IN REFLECTING CHANGES IN FUNCTIONING.
IMPLICATIONS OF THE STUDY FOR FUTURE HEADSTART PROGRAMS AND 
ALSO FOR FURTHER RESEARCH WITH CULTURALLY DEPRIVED CHILDREN 
WERE DISCUSSED. (CO*D) 







U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE 
OFFICE OF EDUCATION 









THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM THE 
PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR OPINIONS 
STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF EDUCATION
POSITION OR POLICY. 



PROJECT HEADSTART 



Development of Appropriate Evaluation Techniques for
Screening Children in a Head Start Program - A Pilot Project



Dr. Stanley I. Berger 
Project Coordinator 















ACKNOWLEDGEMENTS 



I wish to express my gratitude to the Office of Economic Opportunity 
for making this project possible and to the personnel and teachers of Head- 
Start, especially Miss Sylvia Lapin, whose support for and interest in the 
project proved of invaluable assistance. My special appreciation goes to 
Mr. Neal Dye for his specific assistance in analysis of the data.



Dr. Stanley I. Berger










ABSTRACT 

Sixty-one children were enrolled in the local Headstart program and were
administered the Stanford-Binet, Leiter International, Raven Progressive Matrices,
and Peabody Picture Vocabulary tests. In addition, 20 Ss selected at random from the
group were tested both before and after the program. Results indicate: (1)
statistically significant improvement in performance for the 20 Ss, (2) significant
correlations among the various test scores of the total group, and (3) particular
sensitivity of the Leiter and Peabody tests in reflecting changes in functioning.

Implications of the study for future Headstart programs, and also for further
research with culturally deprived children, were discussed.




Introduction 



Since the early Iowa studies of the forties (Wellman, 1940, 1945) there has
been an increasing amount of literature suggesting that early pre-school experience
has a profound impact upon the subsequent intellectual development of the child.
An evaluation of such a program must be concerned with the reliability and validity
of the instruments employed. Project Headstart thus creates several problems for
the investigator concerned with measuring the effect of the program on the
culturally deprived child. One such issue concerns the development of criteria to be
used for selection of truly "culturally deprived" children; i.e., the selection of a
homogeneous group with regard to this variable. A second variable is related to
the nature of the instruments utilized in evaluating the effects of the program on
both the individual and the group.

The cultural factor involved in most intelligence instruments is rather well
established (Cronbach, 1960). Instruments such as the Stanford-Binet, which
primarily require verbal ability, illustrate the effect of cultural differences most
clearly (Davis, 1951; Havighurst and Janke, 1944, 1945; Eells, 1951; Thurstone,
1951). This poses something of a dilemma in evaluating the Headstart program, since
the Binet is also recognized as the best single predictor of scholastic readiness
available (Cronbach, 1960). Thus in developing a battery for this group it would
seem necessary to include instruments in which items are less heavily weighted for
verbal ability, but which yield reliable estimates of intellectual potential. That
is, instruments should be included which essentially correlate with Binet IQ's.

In light of the apparently tenable assumption that cultural deprivation would
vary both quantitatively and qualitatively with the particular geographic area, it
seems further appropriate that some evaluation of the instruments be carried out for
a given locale. Not only would this provide some indication of their reliability
and validity for this group, but it might also yield information regarding which
instrument might optimally be employed to reflect their unique experiences. That is,
it is desirable to ascertain which instrument provides an optimal estimate of
potential intellectual ability of the deprived child.

Thus the purposes of this pilot project were: 

1) an attempt to evaluate the effect of the local program on both 
individual children as well as the group, 

2) an investigation of the test instruments employed in order to derive
information regarding the sensitivity of such media in evaluating such 
a program. 

Method

Subjects. Sixty-one five-year-old children of essentially rural, low-income
background, who were enrolled in the program.

Instruments. The traditional instrument yielding IQ scores employed was the
Stanford-Binet (1960, L-M). Included also were the Leiter International, Raven
Progressive Matrices, and Peabody Picture Vocabulary tests, for essentially non-verbal
indices of ability. A "pre-school inventory" was also given to most children.

Procedure. All Ss were administered at least some of the tests in a period
between one week prior to the start of the program - which lasted 7 weeks - through
the end of the last week. In addition, 20 Ss were randomly selected from the group
for pre- and post-evaluations. These were given the above tests within one week of
the start of the program, and again during the last week and one week thereafter.

As a preliminary pilot study it was decided that the limited time available required
total attention and activity be directed toward this group, and thus a control group
could not be included.





Results and Discussion 

A. Evaluation of total group. As scoring norms and standards were not available
for the pre-school inventory, no evaluation of the inventory was attempted.

Means and standard deviations were computed for scores on each test (see Table
1 below).



TABLE 1

MEANS, STANDARD DEVIATIONS, AND RANGES FOR
SCORES OF EACH INSTRUMENT

            N     Mean     S.D.    Range
Binet      53    91.62    11.65    56-116
Leiter     59    84.20    10.88    50-111
Raven      61     4.09     1.73    0-8
Peabody    59    82.80    12.04    35-110



It should immediately be noted that the obtained scores (means and sigmas)
are inconsistent with what one would logically expect for a truly culturally deprived
group. The extent to which the group deviates from expected poor scores would limit
in itself the application of our findings as regards the culturally deprived child.
Thus, the inclusion of children not culturally deprived could have an influence on
all statistical analysis and interpretation, and all conclusions must be qualified
with this in mind. In light of the lack of objective individual correlates of
cultural deprivation, however, it is impossible to delete individual Ss from the
analysis purely on the basis of high intelligence test scores or cultural background.



B. Evaluation of the instruments. Pearson Product-Moment Correlation Coefficients
were computed among scores on all tests (see Table 2 below).










TABLE 2

PRODUCT-MOMENT INTER-CORRELATIONS AMONG SCORES ON EACH TEST
(Ns in parens)

           Leiter        Raven         Peabody
Binet      .625* (53)    .554* (53)    .510* (53)
Leiter                   .693* (59)    .437  (59)
Raven                                  .412  (59)

* significant at the .05 level



Although not extremely high, and despite a large standard error (.23), these
data do indicate some degree of overlap. If the Binet can be accepted as being one
of the more reliable available tests, then it would appear that, for this group, a
reasonable estimate of intellectual potential can be obtained by employing other
instruments. Since the primary interest is in obtaining an optimal estimate of
functioning, these data would seem to justify the use of the Leiter and Peabody tests
in a program of this sort, despite flaws in standardization data. This would become
especially significant where a child indicates difficulty with verbal items.
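For readers who wish to reproduce this kind of analysis, the following is a minimal sketch of a product-moment correlation computed only over children who have scores on both tests, which is one way the differing Ns in Table 2 could arise. The function name `pearson_r` and all scores are hypothetical illustrations, not data or procedures from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation over pairs where both scores exist.

    Returns (r, n), where n is the number of complete pairs used.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with zero variance has no defined correlation")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical scores for six children; None marks a test that was not given.
binet = [88, 95, None, 102, 79, 91]
leiter = [80, 90, 85, 97, None, 84]
r, n = pearson_r(binet, leiter)
print(f"r = {r:.3f} (N = {n})")
```

Dropping incomplete pairs, rather than substituting a value for the missing score, keeps each coefficient tied to the children actually tested on both instruments, at the cost of a different N per cell.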



C. Evaluation of the pre- and post-program group. Matched-group t-tests computed
between pre- and post-program test scores indicate statistically significant
improvement in all test scores except the Raven (see Table 3 below).

The pragmatic or clinical significance, however, of a 3.4 point gain (as in
the Binet) is questionable, especially since it is well within the standard error
of measurement.















TABLE 3

t TESTS BETWEEN PRE- AND POST-PROGRAM TEST SCORES
(N = 20)

Instrument       d         t        p*
Binet          +3.4      1.61     .01
Leiter        +10.55     3.47     .001
Raven          -0.35     0.778    .05
Peabody        +9.75     3.61     .001

* one-tailed tests of significance
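As an illustration of the matched-group comparison summarized in Table 3, the sketch below computes a mean pre-to-post difference and the corresponding t statistic from paired scores. It assumes the standard paired-difference formula; the function name `paired_t` and the scores are hypothetical and are not the study's data.

```python
import math


def paired_t(pre, post):
    """Matched-pairs t statistic for post minus pre scores.

    Returns (mean difference, t, degrees of freedom).
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must list the same subjects")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two subjects")
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(var_d / n)
    if se == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_d, mean_d / se, n - 1


# Hypothetical pre- and post-program scores for five children.
pre = [80, 85, 90, 95, 100]
post = [83, 86, 95, 97, 104]
mean_d, t, df = paired_t(pre, post)
print(f"d = {mean_d:+.2f}, t = {t:.2f}, df = {df}")
```

The resulting t would then be referred to a one-tailed table with n - 1 degrees of freedom, which for 20 subjects is 19.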



In order to more specifically assess any changes in performance, Binet items
were separated into those requiring primarily verbal, performance, and memory ability
(McNemar, 1942), and t-tests were conducted on the proportion of items passed in
each category. No significant differences were found either between pre- and
post-program scores for each category, or among the categories when measured as units.

In an attempt to compare the various scores for each individual, the group was
ranked on each test according to their standardized score position relative to the
group, for both the pre- and post-programs. Kendall Coefficients of Concordance -
"W" - (Siegel, 1956) computed among ranks indicated no significant changes in rank
within the group or among tests, either pre- or post-program. That is, Ss tended to
do as well or as poorly - relative to the group - on each test, both before and after
their Headstart experience. It would appear, then, that improvement was approximately
equal for most Ss, and that no particular instrument was easier or more difficult
for the group or for the individual Ss.
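The ranking procedure described above can be sketched as follows. This is a minimal illustration of Kendall's coefficient of concordance using the textbook formula W = 12S / (m^2 (n^3 - n)), with m tests and n subjects. Tied scores receive average ranks, but no tie correction is applied to W, which is a simplifying assumption. The function names and scores are hypothetical.

```python
def ranks(scores):
    """Rank scores from 1 (lowest) upward, giving tied scores their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def kendall_w(score_table):
    """Kendall's W for m tests (rows), each scoring the same n subjects (columns)."""
    m = len(score_table)
    if m < 2:
        raise ValueError("need at least two tests")
    n = len(score_table[0])
    if n < 2 or any(len(row) != n for row in score_table):
        raise ValueError("every test must score the same two or more subjects")
    rank_rows = [ranks(row) for row in score_table]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical scores of five children on three tests.
scores = [
    [91, 84, 102, 77, 95],
    [85, 80, 97, 70, 88],
    [83, 86, 99, 72, 90],
]
print(f"W = {kendall_w(scores):.3f}")
```

W ranges from 0 (no agreement among the rankings) to 1 (every test orders the children identically), so a high W corresponds to Ss doing about as well, relative to the group, on each test.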

It should be noted, however, that a very few individuals did respond uniquely,
particularly in an "upward" direction. The small number involved precludes
statistical analysis.



It is impossible, in a study of this sort, to separate the effects of practice
from those of true improvement from the Headstart experience. Also, questions about
reliability of the instruments for this age group necessitate caution in
interpreting these results. It cannot, therefore, be stated at this time whether the
improvement in scores is specifically due to the beneficial effects of Headstart,
although one would suspect that the total gain in scores would not likely be due
solely to practice, especially for such an age group.

D. "Free association" comments made by psychometricians reflecting upon their testing
experience with the children. Some of the children refused to cooperate and
many of these were very low on the testing that was done. Hence there is a
possibility that the sample obtained may not be representative, since many of the lower
cases were not included. There did not seem to be any noticeable difference between
male and female children. For some of the children there was a noticeably short
attention span. They were quite hyperactive, couldn't sit through one test. Some
had language handicaps and some were almost unintelligible. Some were very shy and
withdrawn, especially during the pretesting period. D.H. felt that many of the
children did do their best, did reach their potential and the testing was an adequate
reflection of their abilities. A.H. felt that potentials were never tapped on the
test because of the age level of the children and also because of the tests
themselves. Maturation level and motor ability affected some of the drawing tasks.
Concerning the Stanford-Binet itself, all testers agreed that this test could not be
given in one session. When given in one session, just at the time the test began to
discriminate at the higher levels, the children almost always became tired and would
not cooperate or would give arbitrary answers.

One tester felt that the Picture Completion proved to be one of the better tests.
The elaboration of the drawing usually correlated with the overall ability of the
children. A.W., however, felt that this test was inadequate because of motivational
factors. One of the children in this test attempted a very bizarre drawing with
four claws in place of a left arm. Later it was discovered that she was inoculated
previously with a four-prong needle which clipped into her arm, hence explaining
the bizarre drawing. In this sense, many of the reactions and responses were due
to very practical reasons, such as lack of sleep, poor diet, etc., in some of the
children. However, a few of the children did produce definitely bizarre responses
which were apparent through all the total testing, indicating a need for clinical
referral.



In summary, verbal ability was below the performance. In this regard, post
testing showed improvement in this area. However, almost 7 out of 20 of the sample
cases showed a negative trend with post testing with the Stanford-Binet. Conditions
for testing were far from optimal. Children were taken away from playing games and
snack time which they were enjoying, adding to the poor motivation previously
mentioned. In many cases, testing was done in spite of the teacher's help rather than
with the teacher's help. Occasionally they would cue children as to how to respond
by saying, "This won't hurt a bit. Don't worry, he's not going to hurt you." This
was probably due to the lack of structure in the program and the personnel not knowing
really what was expected of them.

Summary and Conclusions 

Results of this study indicate the following:

(1) The group may not have been truly homogeneous with respect to cultural
deprivation.

(2) A significant correlation exists among non-verbal instruments and
the Stanford-Binet.

(3) Tests which are primarily oriented towards evoking "performance"
rather than verbal behavior appear to be more sensitive in reflecting
change as a result of the children's becoming more familiar with a
benign "action" atmosphere.









(4) A significant gain is achieved in test scores after completion of the 
Headstart program experience. It should be emphasized, however, that 
the limitation of the present design does not permit one to assess how 
much of the improvement might be due to practice effect. 

In general, the results of the study indicate that such a program has a probable
significant impact upon releasing cognitive attitudes and "sets" which are necessary
precursor mechanisms in the learning process. Thus, one sees striking change in
test scores, particularly in those tests which facilitate expression of ability in
pathways most consistent with the child's own cognitive style. While one anticipates
a relative constancy of IQ score, it clearly becomes necessary - particularly with
such a group - to have available instruments which are maximally sensitive in
reflecting actual as well as potential levels of performance and changes in functioning.
While it is possible that the obtained changes may be a function of greater or less
test reliability, it is tenable to hold the view that some tests are actually more
sensitive than others in reflecting change. Thus the Leiter in particular, as well
as the Peabody Picture Vocabulary, seem to be tests which reflect an ability to pick
up both gross as well as subtle shifts in the child's reaction to structured
cognitive tasks. If this position can be borne out by further, more rigorous study, it
would be important to carefully pre-test any instruments used in evaluating such a
program as Headstart in order to determine the degree to which they might mirror
actual changes occurring in the children.

The improvement beyond chance expectation of some few Ss would seem to suggest
the possibility of differential readiness for such a program as Headstart. This is
particularly evident in instruments such as the Leiter and the Peabody, where less
emphasis is placed upon strictly verbal ability, and intellectual potential and
weaknesses may be more clearly and readily reflected. It would appear tenable to
assume that some specific cognitive and personality correlates of cultural deprivation
do exist, and that these can be measured in a properly controlled and rigorously
designed longitudinal study. The specific question one might ask would be concerned
with the prediction of which children might benefit most, or least, from a Headstart
experience. This question could only be answered by including, besides cognitive
instruments, some projective techniques and clinical interviews. A longitudinal
study would also allow for the inclusion of a number of external criteria, such as
school grades, attitudes, etc.

In conclusion, within the very real limits imposed upon the study by virtue
of the lack of adequate controls, the question of homogeneity of the group regarding
"cultural" deprivation, and the necessity to maintain time schedules as a critical
factor, these data do support the assumption that the experience of the Headstart
program can produce effective, positive results.



REFERENCES 

Cronbach, L.J. Essentials of Psychological Testing (2nd ed.). New York:
Harper, 1960.

Davis, A. Socioeconomic influences upon children's learning. Understanding
the Child, 1951, 20, 10-16.

Eells, K., et al. Intelligence and Cultural Differences. Chicago: Univ. of
Chicago Press, 1951.

Havighurst, R.J. and L.L. Janke. Relations between ability and social status
in a midwestern community. J. educ. Psychol., 1944, 35, 357-358; 1945, 36,
499-509.

McNemar, Q. Revision of the Stanford-Binet Scale. Boston: Houghton-Mifflin,
1942.

Siegel, S. Nonparametric Statistics for the Behavioral Sciences. New York:
McGraw-Hill, 1956.

Thurstone, L.L. Creative talent. Proceedings, 1950 Invitational Conference
on Testing Problems. Princeton: E.T.S., 1951.

Wellman, B.L. IQ changes in pre-school and non-pre-school groups during the
preschool years: a summary of the literature. J. Psychol., 1945, 20, 347-368.






